Novel Architectures for Unsupervised Information Bottleneck Based Speaker Diarization of Meetings
Abstract
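Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this article is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In the first part of the work, a varying length segment initialization technique for an Information Bottleneck (IB) based speaker diarization system, using phoneme rate as the side information, is proposed. This initialization distributes speaker information uniformly across the segments and provides a better starting point for IB based clustering. In the second part of the work, we present a Two-Pass Information Bottleneck (TPIB) based speaker diarization system that incorporates speaker discriminative features during the process of diarization. The TPIB based speaker diarization system has shown improvement over the baseline IB based system. During the first pass of the TPIB system, a coarse segmentation is performed using IB based clustering. The alignments obtained are used to generate speaker discriminative features using a shallow feed-forward neural network and linear discriminant analysis. The discriminative features obtained are used in the second pass to obtain the final speaker boundaries. In the final part of the paper, variable segment initialization is combined with the TPIB framework. This leverages the advantages of better segment initialization and speaker discriminative features, and results in an additional improvement in performance. An evaluation on standard meeting datasets shows a significant absolute improvement of 3.9% and 4.7% on the NIST and AMI datasets, respectively.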
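The idea of varying-length segment initialization can be illustrated with a minimal sketch. It assumes that phoneme end times are already available from some phoneme recognizer and that each initial segment should span the same number of phonemes, so that fast and slow speech receive comparable phonetic coverage. The function name `variable_length_segments`, its parameters, and the equal-count rule are hypothetical and chosen for illustration; the abstract does not specify the exact procedure used in the article.

```python
from typing import List, Tuple


def variable_length_segments(
    phoneme_ends: List[float], phonemes_per_segment: int
) -> List[Tuple[float, float]]:
    """Group consecutive phonemes into segments holding equal phoneme counts.

    phoneme_ends: end time in seconds of each phoneme, in ascending order.
                  The first phoneme is assumed to start at 0.0.
    phonemes_per_segment: hypothetical tuning parameter (phonemes per segment).
    Returns a list of (start, end) times; segment durations vary with speaking rate.
    """
    if phonemes_per_segment <= 0:
        raise ValueError("phonemes_per_segment must be positive")

    segments: List[Tuple[float, float]] = []
    start = 0.0
    # Close a segment at the end of every k-th phoneme.
    for i in range(phonemes_per_segment - 1, len(phoneme_ends), phonemes_per_segment):
        end = phoneme_ends[i]
        segments.append((start, end))
        start = end

    # Any leftover phonemes form a final, shorter segment.
    if phoneme_ends and start < phoneme_ends[-1]:
        segments.append((start, phoneme_ends[-1]))
    return segments


if __name__ == "__main__":
    ends = [0.1, 0.2, 0.5, 0.6, 1.0, 1.1, 1.3]
    for seg in variable_length_segments(ends, 2):
        print(seg)
```

In contrast to fixed-duration initialization, the segment boundaries here follow the phoneme stream, so a stretch of rapid speech yields shorter segments than a stretch of slow speech.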
Related resources
Phoneme background model for information bottleneck based speaker diarization
Acoustic variability of speakers arises due to differences in their vocal tract characteristics. These individual speaker characteristics are reflected in a speech signal when speakers pronounce a given phoneme. The current work hypothesizes that clusters within a phoneme spoken by multiple speakers roughly correspond to different speakers. Based on this hypothesis, a Gaussian mixture model (GM...
Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings
Improved diarization results can be obtained through combination of multiple systems. Several combination techniques have been proposed based on output voting, initialization and also integrated approaches. This paper proposes and investigates a novel approach to combine diarization systems through the use of features. A first diarization system, based on the Information Bottleneck, is used to ...
Robust Speaker Diarization for meetings
This thesis shows research performed into the topic of speaker diarization for meeting rooms. It looks into the algorithms and the implementation of an offline speaker segmentation and clustering system for a meeting recording where usually more than one microphone is available. The main research and system implementation has been done while visiting the International Computer Science Institute...
Speaker Diarization in Meetings Domain
The purpose of this study is to develop robust techniques for speaker segmentation and clustering with focus on meetings domain. The techniques examined can however be applied to any other domains such as telephone and broadcast news. Traditional techniques for speaker diarization developed for telephone conversations or broadcast news are based on a single channel, which is notably different f...
Unsupervised Methods for Speaker Diarization
Given a stream of unlabeled audio data, speaker diarization is the process of determining “who spoke when.” We propose a novel approach to solving this problem by taking advantage of the effectiveness of factor analysis as a front-end for extracting speaker-specific features and exploiting the inherent variabilities in the data through the use of unsupervised methods. Upon initial evaluation, o...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2020.3036231